The first step is to load all the packages we need. Two of them deserve a note: the ‘haven’ package and the ‘brms’ package.
Notes on the packages
‘haven’: This package is used to read in data that is native to SAS, Stata, or SPSS.
‘brms’: Used to run the multiple membership model. More generally, ‘brms’ is a package that makes running Bayesian models in Stan much easier. The syntax for mixed models in ‘brms’ is similar to model syntax in lme4. The machinery behind brms is Stan, and therefore C++. Important: this means that you will also need to install a C++ compiler if you do not already have one. Rtools for Windows will do the trick.
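If you are unsure whether a working compiler is already set up, one quick check is the ‘pkgbuild’ package (an extra suggestion; it is not used elsewhere in this tutorial):
# Returns TRUE if a working C++ toolchain is found (assumes 'pkgbuild' is installed)
pkgbuild::has_build_tools(debug = TRUE)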
# Fits multilevel models
library("nlme")
# Also fits multilevel models
library("lme4")
# Adds p-values to lme4
library("lmerTest")
# Used for algorithms
library('optimx')
# bootstrapping
library("boot")
# Useful for plotting
library("sjPlot")
library("sjmisc")
library("sjlabelled")
# Canned Rstan models (needed for the multiple membership example)
library("brms")
# tidyverse
library("tidyverse")Next we set the working directory. You need to specify the folder path from YOUR computer. More specifically use the folder path where the data is stored.
# Enter the folder path on your computer
setwd('C:/enter/the/folderpath/on/your/computer')
We are going to be using a longitudinal example. After reading in the data, we can print the dimensions of the dataset to see if it has the expected number of rows and columns.
# Reading in the data
chapman <- readRDS("longitudinal_chapman.rds")
# Checking the dataset
dim(chapman)
## [1] 140 11
Next we print the rows for a couple of randomly chosen individuals just so we can look at the data.
| id | score | week | weeksq | Week1 | Week2 | Week3 | Week4 | iq | iq_c | iq_c2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 190 | 0 | 0 | 1 | 0 | 0 | 0 | 81 | -32.4571 | 1053.46334 |
| 5 | 220 | 1 | 1 | 0 | 1 | 0 | 0 | 81 | -32.4571 | 1053.46334 |
| 5 | 229 | 2 | 4 | 0 | 0 | 1 | 0 | 81 | -32.4571 | 1053.46334 |
| 5 | 220 | 3 | 9 | 0 | 0 | 0 | 1 | 81 | -32.4571 | 1053.46334 |
| 19 | 167 | 0 | 0 | 1 | 0 | 0 | 0 | 120 | 6.5429 | 42.80954 |
| 19 | 201 | 1 | 1 | 0 | 1 | 0 | 0 | 120 | 6.5429 | 42.80954 |
| 19 | 233 | 2 | 4 | 0 | 0 | 1 | 0 | 120 | 6.5429 | 42.80954 |
| 19 | 216 | 3 | 9 | 0 | 0 | 0 | 1 | 120 | 6.5429 | 42.80954 |
The important variables here are:
id: This is our clustering variable.
score: An individual’s score on an opposites-naming task. The specific task is not super relevant here, but more info is in the codebook. Score is going to be our DV.
week: This is our discrete measure of time. Values range from 0-3.
Let’s briefly look at the data visually. I have randomly sampled 10 individuals and plotted their scores over time. It looks like there is evidence of random slopes and random intercepts. Note that ‘facet_wrap()’ is a useful feature of ggplot, allowing us to make a separate panel for each individual.
# Randomly sample 10 id's
set.seed(111111111)
rand <-sample(unique(chapman$id), 10)
# Plot those 10 id's individually over time.
ggplot(data = chapman[chapman$id %in% rand,], aes(x=week, y = score)) +
geom_point() +
ylim(50, 300)+
geom_smooth(method = 'lm', formula = y ~ x) +
facet_wrap("id")
We are going to use the ‘lmer()’ function from the ‘lme4’ package. First I will fit a random-effects ANOVA model, which is often used to get the ICC.
\[score_{ij} = \gamma_{00} + u_{0j} + r_{ij}\]
Note that I am calling the ‘lmer()’ function as ‘lme4::lmer()’. Both ‘lme4’ and ‘lmerTest’ use the ‘lmer()’ function. We have both packages loaded and R will default to using the function from the last loaded package. One way to make sure R doesn’t get confused is to add the package name before the function, as I am doing below.
# Fitting the model
fit_m0 <- lme4::lmer(score ~ 1 + (1 | id),
data = chapman,
REML = FALSE,
na.action = na.omit)
# Printing the model summary
summary(fit_m0)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: score ~ 1 + (1 | id)
## Data: chapman
##
## AIC BIC logLik deviance df.resid
## 1466.1 1475.0 -730.1 1460.1 137
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3435 -0.6309 -0.0363 0.6178 1.9413
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 574.3 23.96
## Residual 1583.7 39.80
## Number of obs: 140, groups: id, 35
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 204.814 5.265 38.9
# Printing the ICC
tau2 <- 574.3
sigma2 <- 1583.7
ICC <- tau2 / (sigma2 + tau2)
paste("The ICC is:", round(ICC, 3))
## [1] "The ICC is: 0.266"
Let’s fit a few more models. First, let’s see if there is an effect of week. This model would look like:
\[score_{ij} = \gamma_{00} + \gamma_{10}Week_{ij} + u_{0j} + r_{ij}\]
# Fitting a conditional model
fit_m1 <- lme4::lmer(score ~ 1 + week + (1 | id),
data = chapman,
REML = FALSE,
na.action = na.omit)
# summarize model
summary(fit_m1)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: score ~ 1 + week + (1 | id)
## Data: chapman
##
## AIC BIC logLik deviance df.resid
## 1316.1 1327.9 -654.1 1308.1 136
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.10129 -0.55873 0.08959 0.54603 2.15067
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 877.2 29.62
## Residual 372.3 19.30
## Number of obs: 140, groups: id, 35
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 164.374 5.702 28.83
## week 26.960 1.459 18.48
##
## Correlation of Fixed Effects:
## (Intr)
## week -0.384
But wait, where are the p-values? Well, lme4 doesn’t compute p-values. For those desperate for a p-value, or a test that a parameter is different from zero, we can either bootstrap them, run the model with the ‘lmerTest’ package, which adds p-values, or turn to the ‘nlme’ package, which runs multilevel models and does compute p-values.
Note: testing whether variance parameters are different from zero is not straightforward. I recommend against it, or do your reading!
# Bootstrap standard errors with the 'boot' library
b_par <- bootMer(x = fit_m1, FUN = fixef, nsim = 500)
boot.ci(b_par, type = "perc", index = 1)
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 500 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = b_par, type = "perc", index = 1)
##
## Intervals :
## Level Percentile
## 95% (154.7, 175.1 )
## Calculations and Intervals on Original Scale
boot.ci(b_par,type="perc", index=2)## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 500 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = b_par, type = "perc", index = 2)
##
## Intervals :
## Level Percentile
## 95% (24.03, 29.91 )
## Calculations and Intervals on Original Scale
# Or we could bootstrap with the 'confint' function
confint.merMod(fit_m1, parm = c(3, 4), method = "boot", nsim = 500, boot.type = "perc")
## Computing bootstrap confidence intervals ...
## 2.5 % 97.5 %
## (Intercept) 152.81492 175.12459
## week 23.81869 30.15804
# Or we could fit the model with "lmerTest::lmer()"
# which uses the same syntax as 'lmer()'
fit_m1 <- lmerTest::lmer(score ~ 1 + week + (1 | id),
data =chapman,
REML = FALSE,
na.action = na.omit)
# summarize model
summary(fit_m1)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 | id)
## Data: chapman
##
## AIC BIC logLik deviance df.resid
## 1316.1 1327.9 -654.1 1308.1 136
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.10129 -0.55873 0.08959 0.54603 2.15067
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 877.2 29.62
## Residual 372.3 19.30
## Number of obs: 140, groups: id, 35
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 164.374 5.702 47.660 28.83 <2e-16 ***
## week 26.960 1.459 105.000 18.48 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## week -0.384
Let’s add a random slope and use a likelihood ratio test to see if the random slope significantly improves model fit.
\[score_{ij} = \gamma_{00} + \gamma_{10}Week_{ij} + u_{0j} + u_{1j}Week_{ij} + r_{ij}\]
# Adding a random slope for week
fit_m2 <- lme4::lmer(score ~ 1 + week + (1 + week | id),
data = chapman,
REML = FALSE,
na.action = na.omit)
summary(fit_m2)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: score ~ 1 + week + (1 + week | id)
## Data: chapman
##
## AIC BIC logLik deviance df.resid
## 1287.4 1305.0 -637.7 1275.4 134
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.27251 -0.59921 0.02611 0.61085 1.59094
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 1161.3 34.08
## week 127.7 11.30 -0.45
## Residual 159.5 12.63
## Number of obs: 140, groups: id, 35
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 164.374 6.031 27.26
## week 26.960 2.135 12.62
##
## Correlation of Fixed Effects:
## (Intr)
## week -0.489
# LR test comparing model 1 vs model 2
anova(fit_m1, fit_m2)
## Data: chapman
## Models:
## object: score ~ 1 + week + (1 | id)
## ..1: score ~ 1 + week + (1 + week | id)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## object 4 1316.1 1327.9 -654.06 1308.1
## ..1 6 1287.4 1305.0 -637.68 1275.4 32.747 2 7.746e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
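A side note on that test, tying back to the earlier warning about testing variance parameters: the null value of a variance sits on the boundary of the parameter space, so the naive 2-df chi-square p-value above is conservative. A common approximation when adding one random slope (a variance plus a covariance) is an equal mixture of 1-df and 2-df chi-squares:
# Boundary-corrected p-value for the LRT above (0.5*chisq(1) + 0.5*chisq(2) mixture)
lrt <- 32.747
0.5 * pchisq(lrt, df = 1, lower.tail = FALSE) +
  0.5 * pchisq(lrt, df = 2, lower.tail = FALSE)
Here the conclusion is the same either way; with a borderline test statistic the correction can matter.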
# Plotting the random intercepts and slopes
sjp.lmer(fit_m2,
         type = "re",
         facet.grid = FALSE,
         #sort.est = "sort.all",
         y.offset = .4)
# The same plot, with the estimates sorted
sjp.lmer(fit_m2,
         type = "re",
         facet.grid = FALSE,
         sort.est = "sort.all",
         y.offset = .4)
# Pulling out the random intercepts and slopes
ranef(fit_m2)
## $id
## (Intercept) week
## 1 32.506129 5.95736738
## 2 51.466599 1.44854554
## 3 -7.447874 16.81722948
## 4 38.862232 -3.91288223
## 5 29.850259 -13.60432576
## 6 1.787951 2.13132503
## 7 -3.225640 4.52460394
## 8 -65.545984 8.53508424
## 9 -25.467672 -3.53111632
## 10 52.782842 0.08614679
## 11 20.524689 -5.75757227
## 12 -63.173175 10.79752028
## 13 -22.202581 -14.86216274
## 14 18.421235 -15.25997889
## 15 -16.154545 -8.48149363
## 16 -44.546921 0.89676568
## 17 -21.008001 12.09556126
## 18 -48.669731 17.77580514
## 19 10.286218 -7.24746611
## 20 -20.255235 3.24843261
## 21 15.323192 12.64574563
## 22 15.210063 -16.47989597
## 23 31.375026 -8.04663371
## 24 17.589785 -15.82655781
## 25 10.511332 -4.18921023
## 26 37.059813 -0.29968625
## 27 -8.906279 -4.18780226
## 28 25.235743 6.81278924
## 29 10.097049 -18.67303098
## 30 39.424327 3.89366726
## 31 -12.495909 -0.50142564
## 32 -32.109721 17.44093878
## 33 6.048776 8.58517478
## 34 -75.806575 12.19430363
## 35 2.652583 -5.02576590
# Plot random slopes and random intercepts simultaneously
sjp.lmer(fit_m2,
         type = "rs.ri")
Or if you are feeling saucy, maybe you want to save the values and have more flexibility with your plotting. For example, we could take all of the predicted values and plot them in ggplot2. Maybe we then feed it into plotly to get more of an interactive plot. Get creative!
# adding in the predicted scores
chapman <- as.data.frame(cbind(chapman, predict(fit_m1), predict(fit_m2)))
# Plotting m1 with fixed slopes
p1 <- ggplot(data =chapman, aes(x=week, y=predict(fit_m1),
color = factor(id), group = factor(id))) +
geom_point() +
geom_line()
# Plotting m2 with random slopes
p2 <-ggplot(data =chapman, aes(x=week, y=predict(fit_m2),
color = factor(id), group = factor(id))) +
geom_point() +
geom_line()
plotly::ggplotly(p1)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
plotly::ggplotly(p2)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
In many cases you might decide you want to adjust your optimizer. For details, see:
https://cran.r-project.org/web/packages/lme4/vignettes/lmerperf.html
https://www.rdocumentation.org/packages/lme4/versions/1.1-13/topics/lmerControl
# Explicitly requesting the 'bobyqa' optimizer
fit_m2 <- lmer(score ~ 1 + week + (1 + week | id),
data = chapman,
REML = FALSE,
na.action = na.omit,
control = lmerControl(optimizer = "bobyqa"))
summary(fit_m2)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 + week | id)
## Data: chapman
## Control: lmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 1287.4 1305.0 -637.7 1275.4 134
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.27251 -0.59921 0.02611 0.61085 1.59094
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 1161.3 34.08
## week 127.7 11.30 -0.45
## Residual 159.5 12.63
## Number of obs: 140, groups: id, 35
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 164.374 6.031 35.000 27.26 < 2e-16 ***
## week 26.960 2.135 35.000 12.62 1.38e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## week -0.489
# adding the 'nloptwrap' optimizer
fit_m2 <- lmer(score ~ 1 + week + (1 + week | id),
data = chapman,
REML = FALSE,
na.action = na.omit,
control = lmerControl(optimizer = "nloptwrap"))
summary(fit_m2)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 + week | id)
## Data: chapman
## Control: lmerControl(optimizer = "nloptwrap")
##
## AIC BIC logLik deviance df.resid
## 1287.4 1305.0 -637.7 1275.4 134
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.27258 -0.59923 0.02614 0.61081 1.59101
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 1161.7 34.08
## week 127.7 11.30 -0.45
## Residual 159.5 12.63
## Number of obs: 140, groups: id, 35
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 164.374 6.032 34.980 27.25 < 2e-16 ***
## week 26.960 2.135 35.000 12.62 1.38e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## week -0.490
# The 'optimx' package has more optimizers. Here I use the 'nlminb' optimizer
fit_m2 <- lmer(score ~ 1 + week + (1 + week | id),
data = chapman,
REML = FALSE,
na.action = na.omit,
control = lmerControl(optimizer = "optimx", calc.derivs = FALSE,
optCtrl = list(method = "nlminb",
starttests = FALSE, kkt = FALSE)))
summary(fit_m2)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula: score ~ 1 + week + (1 + week | id)
## Data: chapman
## Control:
## lmerControl(optimizer = "optimx", calc.derivs = FALSE, optCtrl = list(method = "nlminb",
## starttests = FALSE, kkt = FALSE))
##
## AIC BIC logLik deviance df.resid
## 1287.4 1305.0 -637.7 1275.4 134
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.27251 -0.59921 0.02611 0.61085 1.59094
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 1161.3 34.08
## week 127.7 11.30 -0.45
## Residual 159.5 12.63
## Number of obs: 140, groups: id, 35
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 164.374 6.031 35.000 27.26 < 2e-16 ***
## week 26.960 2.135 35.000 12.62 1.38e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## week -0.489
While the lme4 package has some great features, it also has some disadvantages. One disadvantage is that it is not very good at handling autocorrelation. For that we will use the ‘nlme’ package, which makes it much easier to specify autocorrelation structures.
For more info on the correlation structures available in ‘nlme’, see:
https://stat.ethz.ch/R-manual/R-devel/library/nlme/html/corClasses.html
# Fitting the same random slopes model - NO autocorrelation
# Note that I have specified missing data actions and an optimizer
fit_m2_nlme <- lme(score ~ 1 + week, random = ~ 1+week|id,
data = chapman,
na.action ="na.omit",
control = lmeControl(opt='optim')) # defaults to 'nlminb'
summary(fit_m2_nlme)
## Linear mixed-effects model fit by REML
## Data: chapman
## AIC BIC logLik
## 1278.823 1296.386 -633.4114
##
## Random effects:
## Formula: ~1 + week | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 34.62343 (Intr)
## week 11.50655 -0.45
## Residual 12.62842
##
## Fixed effects: score ~ 1 + week
## Value Std.Error DF t-value p-value
## (Intercept) 164.3743 6.118861 104 26.86354 0
## week 26.9600 2.166604 104 12.44344 0
## Correlation:
## (Intr)
## week -0.489
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.2720636 -0.6003582 0.0263138 0.6073054 1.5843972
##
## Number of Observations: 140
## Number of Groups: 35
# Adding an autoregressive process of order 1.
fit_m2AR_nlme <- lme(score ~ 1 + week, random = ~ 1+week|id,
data =chapman,
na.action ="na.omit",
control = lmeControl(opt='optim'),
correlation = corAR1())
summary(fit_m2AR_nlme)
## Linear mixed-effects model fit by REML
## Data: chapman
## AIC BIC logLik
## 1280.682 1301.173 -633.341
##
## Random effects:
## Formula: ~1 + week | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 34.88366 (Intr)
## week 11.65159 -0.451
## Residual 12.02219
##
## Correlation Structure: AR(1)
## Formula: ~1 | id
## Parameter estimate(s):
## Phi
## -0.1084264
## Fixed effects: score ~ 1 + week
## Value Std.Error DF t-value p-value
## (Intercept) 164.41221 6.113990 104 26.89115 0
## week 26.91905 2.156897 104 12.48045 0
## Correlation:
## (Intr)
## week -0.485
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.329126837 -0.619086277 0.007856578 0.645003940 1.589727248
##
## Number of Observations: 140
## Number of Groups: 35
# adding an autoregressive moving average process
# p and q arguments are arbitrary here. I'd think more deeply about
# how you want to specify those
fit_m2AR_nlme <- lme(score ~ 1 + week, random = ~ 1+week|id,
data = chapman,
na.action ="na.omit",
control = lmeControl(opt='optim'),
correlation = corARMA(p=1,q=1))
summary(fit_m2AR_nlme)
## Linear mixed-effects model fit by REML
## Data: chapman
## AIC BIC logLik
## 1282.523 1305.941 -633.2617
##
## Random effects:
## Formula: ~1 + week | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 34.60435 (Intr)
## week 11.48149 -0.446
## Residual 12.44719
##
## Correlation Structure: ARMA(1,1)
## Formula: ~1 | id
## Parameter estimate(s):
## Phi1 Theta1
## -0.9525161 0.9126944
## Fixed effects: score ~ 1 + week
## Value Std.Error DF t-value p-value
## (Intercept) 164.45797 6.099994 104 26.96035 0
## week 26.90339 2.152586 104 12.49817 0
## Correlation:
## (Intr)
## week -0.484
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.25965587 -0.60044419 0.03105887 0.62690880 1.58419500
##
## Number of Observations: 140
## Number of Groups: 35
# Does the ARMA autocorrelation improve model fit?
anova(fit_m2_nlme, fit_m2AR_nlme)
## Model df AIC BIC logLik Test L.Ratio
## fit_m2_nlme 1 6 1278.823 1296.386 -633.4114
## fit_m2AR_nlme 2 8 1282.523 1305.941 -633.2617 1 vs 2 0.2993853
## p-value
## fit_m2_nlme
## fit_m2AR_nlme 0.861
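One way to gauge whether there is residual autocorrelation worth modeling in the first place is ‘nlme’’s ‘ACF()’ function, applied to the fit without an autocorrelation structure:
# Empirical autocorrelation of the normalized residuals, with critical bands
plot(ACF(fit_m2_nlme, resType = "normalized"), alpha = 0.05)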
For the sake of time I am going to skip to the random slopes model and show how we might fit it in a Bayesian framework. Doing this in R is actually rather simple. So simple, in fact, that we could use the exact same lme4 code with the ‘brm()’ function instead of ‘lmer()’. Just like this.
# Fitting the brms model
fit_m0b <- brm(score ~ 1 + week + (1 + week | id),
data = chapman)
But this may be a bit naive, because the ‘brm()’ function has many built-in defaults (priors, burn-in/warmup iterations, sampling iterations, chains, etc.). Below is the exact same model, but with some of the important defaults made explicit.
# Fitting the brms model
fit_m2b <- brm(score ~ 1 + week + (1 + week | id),
data = chapman,
prior = NULL,
autocor = NULL,
chains = 4,
iter = 2000,
warmup = 1000)
## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 3.188 seconds (Warm-up)
## 1.15 seconds (Sampling)
## 4.338 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 3.118 seconds (Warm-up)
## 0.969 seconds (Sampling)
## 4.087 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 2.856 seconds (Warm-up)
## 1.022 seconds (Sampling)
## 3.878 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 4).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 3.433 seconds (Warm-up)
## 1.038 seconds (Sampling)
## 4.471 seconds (Total)
summary(fit_m2b)
## Family: gaussian(identity)
## Formula: score ~ 1 + week + (1 + week | id)
## Data: chapman (Number of observations: 140)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
## ICs: LOO = Not computed; WAIC = Not computed
##
## Group-Level Effects:
## ~id (Number of levels: 35)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 35.37 4.79 27.29 46.20 1364 1
## sd(week) 11.85 1.89 8.51 15.78 1237 1
## cor(Intercept,week) -0.40 0.16 -0.68 -0.06 1594 1
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 164.32 6.30 152.12 176.93 614 1
## week 26.93 2.26 22.54 31.36 1355 1
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 12.95 1.13 10.95 15.34 1787 1
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Plotting the posterior distributions and the effects is especially useful here.
# Plotting parameters
plot(fit_m2b)
stanplot(fit_m2b)
That correlation looks different from zero in the posterior distributions but equal to zero in the effects plot? Turns out it’s just because we are mixing parameters with very different scales. Let’s plot just the correlation.
stanplot(fit_m2b, pars = "cor_id")I’m mostly taking a pass on priors, because I’m not particularly savy with Bayesian methods.
Gelman A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian analysis, 1(3), 515 - 534.
https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations
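If nothing else, it is worth seeing what ‘brms’ will use when you don’t specify anything. The ‘get_prior()’ function lists every default prior for a given model formula:
# Inspect the default priors before overriding any of them
get_prior(score ~ 1 + week + (1 + week | id),
          data = chapman)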
# Fitting the brms model
fit_m2b_priors <- brm(score ~ 1 + week + (1 + week | id),
data = chapman,
prior = set_prior(prior = "normal(0,10)",
class = "b"),
autocor = NULL,
chains = 4,
iter = 2000,
warmup = 1000
)
## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 3.026 seconds (Warm-up)
## 1.258 seconds (Sampling)
## 4.284 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 2.508 seconds (Warm-up)
## 0.92 seconds (Sampling)
## 3.428 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 2.647 seconds (Warm-up)
## 0.929 seconds (Sampling)
## 3.576 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 4).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 2.578 seconds (Warm-up)
## 1.208 seconds (Sampling)
## 3.786 seconds (Total)
summary(fit_m2b_priors)
## Family: gaussian(identity)
## Formula: score ~ 1 + week + (1 + week | id)
## Data: chapman (Number of observations: 140)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
## ICs: LOO = Not computed; WAIC = Not computed
##
## Group-Level Effects:
## ~id (Number of levels: 35)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 35.62 4.82 27.65 46.53 1267 1
## sd(week) 11.99 1.97 8.61 16.37 1091 1
## cor(Intercept,week) -0.41 0.16 -0.68 -0.05 1131 1
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 166.17 6.38 153.75 178.58 675 1
## week 25.64 2.22 21.21 29.77 1210 1
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 12.96 1.15 10.93 15.43 1750 1
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Plotting parameters
plot(fit_m2b_priors)
stanplot(fit_m2b_priors)
# Fitting the brms model with an AR(1) process
fit_m2b_AR <- brm(score ~ 1 + week + (1 | id),
data = chapman,
prior = set_prior(prior = "normal(0,10)",
class = "b"),
autocor = cor_ar(~week|id, p=1), # Adding an AR 1 process here
chains = 4,
iter = 2000,
warmup = 1000
)
## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 2.784 seconds (Warm-up)
## 1.349 seconds (Sampling)
## 4.133 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 2.625 seconds (Warm-up)
## 1.402 seconds (Sampling)
## 4.027 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 2.627 seconds (Warm-up)
## 1.311 seconds (Sampling)
## 3.938 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 4).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 3.101 seconds (Warm-up)
## 1.487 seconds (Sampling)
## 4.588 seconds (Total)
Next we turn to a binary outcome, modeled with multilevel logistic regression.
# Reading in the data
thai <- readRDS('binary_thai.rds')
# Checking the dataset
dim(thai)
## [1] 8582 8
| schoolid | male | pped | rep1 | l_enrc | factotc | mtequalc | msesc |
|---|---|---|---|---|---|---|---|
| 180210 | 1 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 0 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 0 | 0 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 0 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 1 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 1 | 0 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 1 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 0 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 0 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
| 180210 | 0 | 1 | 0 | 0.95 | 0.79 | 0.27 | 0.47 |
Some of the variables we are going to use are:
schoolid: This is our clustering variable. Students are nested within schools.
pped: A binary indicator of whether or not a student had pre-primary education. This will be our DV.
male: A binary indicator of gender.
msesc: Mean SES of the school.
# Fitting a null model
fit_m0 <- glmer(pped ~ 1+ (1 |schoolid),
family = "binomial",
data = thai,
control = glmerControl(optimizer = "bobyqa"))
summary(fit_m0)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: pped ~ 1 + (1 | schoolid)
## Data: thai
## Control: glmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 7215.8 7230.0 -3605.9 7211.8 8580
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -5.3048 -0.2980 0.1255 0.4049 4.7030
##
## Random effects:
## Groups Name Variance Std.Dev.
## schoolid (Intercept) 9.13 3.022
## Number of obs: 8582, groups: schoolid, 411
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.3784 0.1567 -2.415 0.0157 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
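For a logistic model there is no level-1 residual variance on the data scale, so the ICC is usually computed on the latent logit scale by fixing the level-1 variance at pi^2/3. This is a convention of the latent-variable formulation, not something ‘glmer()’ reports:
# ICC on the latent (logit) scale: tau2 / (tau2 + pi^2/3)
tau2_school <- as.data.frame(VarCorr(fit_m0))$vcov[1]
tau2_school / (tau2_school + pi^2 / 3)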
# Adding predictors and random slopes
fit_m1 <- glmer(pped~1+male+ msesc+ (1+male|schoolid),
family = "binomial",
data = thai,
glmerControl(optimizer ="bobyqa"))
summary(fit_m1)## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: pped ~ 1 + male + msesc + (1 + male | schoolid)
## Data: thai
## Control: glmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 6838.0 6880.1 -3413.0 6826.0 8335
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -6.8928 -0.2904 -0.1076 0.4046 4.8674
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## schoolid (Intercept) 6.4034 2.5305
## male 0.1993 0.4465 -0.18
## Number of obs: 8341, groups: schoolid, 399
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.36423 0.14508 -2.510 0.0121 *
## male -0.14345 0.07706 -1.862 0.0627 .
## msesc 3.92377 0.32744 11.983 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) male
## male -0.291
## msesc -0.037 -0.028
# Putting confidence intervals around the effects
confint(fit_m1, parm = "beta_", method = "Wald")
## 2.5 % 97.5 %
## (Intercept) -0.6485839 -0.079872048
## male -0.2944821 0.007576801
## msesc 3.2819954 4.565539692
# Odds ratios instead of log-odds
exp(fixef(fit_m1))
## (Intercept) male msesc
## 0.6947328 0.8663618 50.5906880
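The same exponentiation trick converts the Wald intervals from above to the odds-ratio scale:
# Confidence intervals on the odds-ratio scale
exp(confint(fit_m1, parm = "beta_", method = "Wald"))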
Estimation options for generalized linear mixed models are fairly limited in R, from what I have seen. For example, the default number of adaptive quadrature points is 1 (which is equivalent to the Laplace approximation). We can increase the number of quadrature points, but only if the model has a single random intercept. With any additional random slopes, nAGQ cannot be more than 1.
fit <- glmer(pped ~ 1 + male + msesc + (1 | schoolid),
             family = "binomial",
             data = thai,
             nAGQ = 5,
             control = glmerControl(optimizer = "bobyqa"))
But as soon as we add a random slope this model will not run.
fit <- glmer(pped ~ 1 + male + msesc + (1 + male | schoolid),
             family = "binomial",
             data = thai,
             nAGQ = 5,
             control = glmerControl(optimizer = "bobyqa"))
As we did before, we can change the optimizer to ‘nloptwrap’.
fit <- glmer(pped ~ 1 + male + msesc + (1 + male | schoolid),
             family = "binomial",
             data = thai,
             nAGQ = 1,
             control = glmerControl(optimizer = "nloptwrap"))
We can also fit the model in a Bayesian framework with ‘brm()’.
fit <- brm(pped ~ 1 + male + msesc + (1|schoolid),
family = "binomial",
data = thai,
chains = 2)
## Using the maximum of the response variable as the number of trials.
## Only 2 levels detected so that family 'bernoulli' might be a more efficient choice.
## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL 'binomial(logit) brms-model' NOW (CHAIN 1).
##
## Gradient evaluation took 0.006 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 60 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 132.65 seconds (Warm-up)
## 157.117 seconds (Sampling)
## 289.767 seconds (Total)
##
##
## SAMPLING FOR MODEL 'binomial(logit) brms-model' NOW (CHAIN 2).
##
## Gradient evaluation took 0.004 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 40 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 2000 [ 0%] (Warmup)
## Iteration: 200 / 2000 [ 10%] (Warmup)
## Iteration: 400 / 2000 [ 20%] (Warmup)
## Iteration: 600 / 2000 [ 30%] (Warmup)
## Iteration: 800 / 2000 [ 40%] (Warmup)
## Iteration: 1000 / 2000 [ 50%] (Warmup)
## Iteration: 1001 / 2000 [ 50%] (Sampling)
## Iteration: 1200 / 2000 [ 60%] (Sampling)
## Iteration: 1400 / 2000 [ 70%] (Sampling)
## Iteration: 1600 / 2000 [ 80%] (Sampling)
## Iteration: 1800 / 2000 [ 90%] (Sampling)
## Iteration: 2000 / 2000 [100%] (Sampling)
##
## Elapsed Time: 144.34 seconds (Warm-up)
## 57.212 seconds (Sampling)
## 201.552 seconds (Total)
summary(fit)
## Family: binomial(logit)
## Formula: pped ~ 1 + male + msesc + (1 | schoolid)
## Data: thai (Number of observations: 8341)
## Samples: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 2000
## ICs: LOO = Not computed; WAIC = Not computed
##
## Group-Level Effects:
## ~schoolid (Number of levels: 399)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 2.56 0.14 2.29 2.84 390 1
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept -0.35 0.14 -0.63 -0.08 289 1.00
## male -0.15 0.07 -0.28 -0.02 2000 1.00
## msesc 3.90 0.34 3.26 4.57 187 1.01
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
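Before plotting, note that the messages above suggested the ‘bernoulli’ family as the more efficient choice for a 0/1 outcome. A sketch of that refit (the object name ‘fit_bern’ is just illustrative):
# Refit using the bernoulli family suggested in the sampler messages
fit_bern <- brm(pped ~ 1 + male + msesc + (1 | schoolid),
                family = bernoulli(),
                data = thai,
                chains = 2)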
plot(fit)stanplot(fit)data("Penicillin")
dim(Penicillin)
## [1] 144 3
head(Penicillin)
## diameter plate sample
## 1 27 a A
## 2 23 a B
## 3 26 a C
## 4 23 a D
## 5 23 a E
## 6 21 a F
The data are described in Davies and Goldsmith (1972) as coming from an investigation to “assess the variability between samples of penicillin by the B. subtilis method. In this test method a bulk-inoculated nutrient agar medium is poured into a Petri dish of approximately 90 mm. diameter, known as a plate. When the medium has set, six small hollow cylinders or pots (about 4 mm. in diameter) are cemented onto the surface at equally spaced intervals. A few drops of the penicillin solutions to be compared are placed in the respective cylinders, and the whole plate is placed in an incubator for a given time. Penicillin diffuses from the pots into the agar, and this produces a clear circular zone of inhibition of growth of the organisms, which can be readily measured. The diameter of the zone is related in a known way to the concentration of penicillin in the solution.”
p3 <- ggplot(data = Penicillin, aes(y=diameter, x=plate, color = sample, group = sample)) +
geom_point()+
geom_line()
plotly::ggplotly(p3)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
fit <- lme4::lmer(diameter ~ 1 + (1|plate) + (1|sample),
data = Penicillin,
REML = FALSE,
na.action = na.omit,
control = lmerControl(optimizer = "bobyqa"))
summary(fit)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample)
## Data: Penicillin
## Control: lmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 340.2 352.1 -166.1 332.2 140
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.08031 -0.67227 0.06412 0.58356 2.97933
##
## Random effects:
## Groups Name Variance Std.Dev.
## plate (Intercept) 0.7150 0.8456
## sample (Intercept) 3.1352 1.7706
## Residual 0.3024 0.5499
## Number of obs: 144, groups: plate, 24; sample, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 22.9722 0.7446 30.85
fit <- brm(diameter ~ 1 + (1|plate) + (1|sample),
data = Penicillin,
warmup = 5000,
iter = 10000)
## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 6.873 seconds (Warm-up)
## 9.935 seconds (Sampling)
## 16.808 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 6.818 seconds (Warm-up)
## 8.843 seconds (Sampling)
## 15.661 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 8.537 seconds (Warm-up)
## 9.862 seconds (Sampling)
## 18.399 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 4).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 6.876 seconds (Warm-up)
## 8.183 seconds (Sampling)
## 15.059 seconds (Total)
## Warning: There were 70 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Warning: Examine the pairs() plot to diagnose sampling problems
summary(fit)
## Warning: There were 70 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
## See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: gaussian(identity)
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample)
## Data: Penicillin (Number of observations: 144)
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 1;
## total post-warmup samples = 20000
## ICs: LOO = Not computed; WAIC = Not computed
##
## Group-Level Effects:
## ~plate (Number of levels: 24)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 0.89 0.15 0.65 1.23 2973 1
##
## ~sample (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 2.61 1.11 1.3 5.65 3215 1
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 22.99 1.21 20.55 25.49 2287 1
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 0.56 0.04 0.49 0.64 11352 1
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit)
stanplot(fit)
From the ‘brms’ documentation: whenever you see the warning “There were x divergent transitions after warmup.” you should really think about increasing adapt_delta. To do this, write control = list(adapt_delta = <x>), with <x> somewhere between the default of 0.8 and 1. We use .99 below.
fit <- brm(diameter ~ 1 + (1|plate) + (1|sample),
data = Penicillin,
warmup = 5000,
iter = 10000,
control = list(adapt_delta = .99))
## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 22.204 seconds (Warm-up)
## 22.347 seconds (Sampling)
## 44.551 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 24.706 seconds (Warm-up)
## 29.31 seconds (Sampling)
## 54.016 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 22.433 seconds (Warm-up)
## 21.742 seconds (Sampling)
## 44.175 seconds (Total)
##
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 4).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 22.197 seconds (Warm-up)
## 24.449 seconds (Sampling)
## 46.646 seconds (Total)
## Warning: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Warning: There were 43 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
## http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
## Warning: Examine the pairs() plot to diagnose sampling problems
summary(fit)
## Warning: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help.
## See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: gaussian(identity)
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample)
## Data: Penicillin (Number of observations: 144)
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 1;
## total post-warmup samples = 20000
## ICs: LOO = Not computed; WAIC = Not computed
##
## Group-Level Effects:
## ~plate (Number of levels: 24)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 0.89 0.15 0.65 1.23 2901 1
##
## ~sample (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 2.63 1.2 1.28 5.73 3768 1
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 22.97 1.18 20.66 25.31 3034 1
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 0.56 0.04 0.49 0.64 12129 1
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit)
stanplot(fit)
But I couldn’t get this one to run without any errors :/
The next example is cross-classified: observations belong to both states and policies.
cross <- read.csv("crossclassified.csv")
dim(cross)
## [1] 11200 13
kable(head(cross))
| id | year | policy | state | citi_ideo | time | legprofess | popdens | diffus | ideodist | ideo_zero | outcome | dem |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1994 | 1 | alabama | 39.415 | 1 | 0.158 | 83.2 | 0 | 13.3256 | 9.534531 | 0 | 1 |
| 1 | 1994 | 13 | alabama | 39.415 | 1 | 0.158 | 83.2 | 0 | 0.0000 | 9.534531 | 0 | 1 |
| 1 | 1994 | 4 | alabama | 39.415 | 1 | 0.158 | 83.2 | 0 | 0.0000 | 9.534531 | 0 | 1 |
| 1 | 1994 | 7 | alabama | 39.415 | 1 | 0.158 | 83.2 | 0 | 0.0000 | 9.534531 | 0 | 1 |
| 1 | 1994 | 10 | alabama | 39.415 | 1 | 0.158 | 83.2 | 0 | 0.0000 | 9.534531 | 0 | 1 |
| 1 | 1994 | 9 | alabama | 39.415 | 1 | 0.158 | 83.2 | 0 | 0.0000 | 9.534531 | 0 | 1 |
fit <- lme4::glmer(outcome ~ 1 + (1|state) + (1|policy),
family = "binomial",
data = cross)
summary(fit)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: outcome ~ 1 + (1 | state) + (1 | policy)
## Data: cross
##
## AIC BIC logLik deviance df.resid
## 2856.2 2877.7 -1425.1 2850.2 9370
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.4944 -0.2191 -0.1678 -0.1274 10.1235
##
## Random effects:
## Groups Name Variance Std.Dev.
## state (Intercept) 0.4495 0.6705
## policy (Intercept) 0.3580 0.5983
## Number of obs: 9373, groups: state, 50; policy, 14
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.4432 0.1969 -17.49 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
fit <- lme4::glmer(outcome ~ 1 + time + (1|state) + (1|policy),
family = "binomial",
data = cross)
summary(fit)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: outcome ~ 1 + time + (1 | state) + (1 | policy)
## Data: cross
##
## AIC BIC logLik deviance df.resid
## 2575.4 2604.0 -1283.7 2567.4 9369
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.9184 -0.2055 -0.1238 -0.0714 30.2225
##
## Random effects:
## Groups Name Variance Std.Dev.
## state (Intercept) 1.033 1.0165
## policy (Intercept) 0.738 0.8591
## Number of obs: 9373, groups: state, 50; policy, 14
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.82404 0.33103 -17.59 <2e-16 ***
## time 0.24133 0.01536 15.71 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## time -0.537
fit <- lme4::glmer(outcome ~ 1 + citi_ideo + legprofess + popdens +
diffus + ideodist + ideo_zero + dem + time +
(1|state) + (1|policy),
family = "binomial",
data = cross)
I don’t have data to show for these, but I thought I’d share the general syntax if you were to fit some count models. Notice that for Poisson we only change the family argument.
# Poisson
fit <- lme4::glmer(y ~ 1 + x1 + x2 + (1 | groupvar),
                   family = "poisson",
                   data = mydata)
Negative binomial might be possible, but you may be better off in something like Stata…
Note from the lme4 package: “Parts of glmer.nb() are still experimental and methods are still missing or suboptimal. In particular, there is no inference available for the dispersion parameter θ, yet.”
# Negative binomial
fit <- lme4::glmer.nb(y ~ 1 + x1 + x2 + (1 | groupvar),
                      data = mydata)
The ‘brms’ package also supports a large number of distributions and link functions for the response variable.
See the section on ‘brmsfamily’ in the package manual: https://cran.r-project.org/web/packages/brms/brms.pdf
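For example, the negative binomial model above could be written in ‘brms’ by only changing the family argument; a sketch using the same placeholder names (y, x1, x2, groupvar, mydata):
# Negative binomial in brms: same formula syntax, different family.
# Unlike glmer.nb(), brms gives full posterior inference for the
# shape (dispersion) parameter.
fit <- brm(y ~ 1 + x1 + x2 + (1 | groupvar),
           family = negbinomial(),
           data = mydata)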
Our last example is a multiple membership model, using a nursing dataset in which patients are treated by multiple nurses.
# Reading in the data
nurses <- readRDS("mm_nursing.rds")
Checking the dimensions of the dataset, we have 1000 cases and 66 variables.
# Checking the dataset
dim(nurses)## [1] 1000 66
I am going to subset to just the variables of interest.
nurses <- nurses[, c("patient", "satis", "assess",
"n1st", "n2nd", "n3rd", "n4th",
"p1st", "p2nd", "p3rd", "p4th")]and we can print some of the raw data to see what it looks like
# Set a seed and randomly sample from rows 1-1000.
# The random sample here isn't important, only useful for displaying data.
set.seed(872017)
randomRows <- sample(1:1000,10)
# Print the randomly sampled rows
kable(head(nurses[randomRows,], n = 10))

| patient | satis | assess | n1st | n2nd | n3rd | n4th | p1st | p2nd | p3rd | p4th |
|---|---|---|---|---|---|---|---|---|---|---|
| 529 | -0.0569810 | -0.3350042 | 25 | 11 | 0 | 0 | 0.33 | 0.67 | 0.00 | 0 |
| 126 | 0.1723346 | -1.1262732 | 17 | 0 | 0 | 0 | 1.00 | 0.00 | 0.00 | 0 |
| 319 | 1.4107995 | -0.8070087 | 11 | 0 | 0 | 0 | 1.00 | 0.00 | 0.00 | 0 |
| 422 | 1.0342109 | 1.4629703 | 3 | 17 | 0 | 0 | 0.74 | 0.26 | 0.00 | 0 |
| 432 | -1.2950493 | -1.4369467 | 8 | 20 | 0 | 0 | 0.51 | 0.49 | 0.00 | 0 |
| 336 | 0.0504741 | -0.4920052 | 5 | 0 | 0 | 0 | 1.00 | 0.00 | 0.00 | 0 |
| 370 | -0.0277822 | -1.1063883 | 21 | 0 | 0 | 0 | 1.00 | 0.00 | 0.00 | 0 |
| 560 | 0.1216989 | -1.3557973 | 12 | 23 | 0 | 0 | 0.61 | 0.39 | 0.00 | 0 |
| 711 | -0.8439955 | -1.4817982 | 24 | 2 | 14 | 0 | 0.41 | 0.46 | 0.13 | 0 |
| 867 | -1.4477801 | -1.5170703 | 15 | 8 | 12 | 0 | 0.48 | 0.35 | 0.17 | 0 |
The important variables here are:
patient: The patient ID; patients are our level-1 units.
satis: Patient satisfaction. This will be our DV.
assess: A patient assessment measure, used as a covariate.
n1st-n4th: The IDs of the first through fourth nurse who treated the patient (0 when there was no such nurse).
p1st-p4th: The proportion of time the patient spent with the corresponding nurse; the four proportions sum to 1 within each row.
So the general idea for this model is that each individual patient is nested within multiple nurses, which is exactly what a multiple membership model handles! We are going to weight the effect of each nurse by the proportion of time the patient spent with that nurse.
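Because the nurse effects are weighted by time, it is worth verifying that the weights sum to 1 within each patient row before fitting. A quick sanity check (my addition, not part of the original workflow):
# Multiple-membership weights should sum to 1 for every patient
summary(rowSums(nurses[, c("p1st", "p2nd", "p3rd", "p4th")]))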
Unfortunately, our usual multilevel modeling packages, ‘lme4’ and ‘nlme’, cannot fit multiple membership models. So we turn to the Bayesian package ‘brms’, which makes fitting these in R possible. I am using the defaults for priors and number of chains, and I am setting the warmup (burn-in) and total iterations myself. If you would like to change these, or do not know what they are, I suggest reading some of the ‘brms’ documentation.
# model syntax
# This will spit out lots of information about the model fitting process
fit_mm <- brm(satis ~ 1 + assess + (1 | mm(n1st, n2nd, n3rd, n4th,
weights = cbind(p1st, p2nd, p3rd, p4th))),
data = nurses,
warmup = 5000,
iter = 10000)## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).
##
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
##
##
## Iteration: 1 / 10000 [ 0%] (Warmup)
## Iteration: 1000 / 10000 [ 10%] (Warmup)
## Iteration: 2000 / 10000 [ 20%] (Warmup)
## Iteration: 3000 / 10000 [ 30%] (Warmup)
## Iteration: 4000 / 10000 [ 40%] (Warmup)
## Iteration: 5000 / 10000 [ 50%] (Warmup)
## Iteration: 5001 / 10000 [ 50%] (Sampling)
## Iteration: 6000 / 10000 [ 60%] (Sampling)
## Iteration: 7000 / 10000 [ 70%] (Sampling)
## Iteration: 8000 / 10000 [ 80%] (Sampling)
## Iteration: 9000 / 10000 [ 90%] (Sampling)
## Iteration: 10000 / 10000 [100%] (Sampling)
##
## Elapsed Time: 20.073 seconds (Warm-up)
## 18.607 seconds (Sampling)
## 38.68 seconds (Total)
##
## (Sampling output for chains 2-4 omitted; it is essentially identical to chain 1, each taking roughly 35 seconds in total.)
# Printing the model parameters
summary(fit_mm)## Family: gaussian(identity)
## Formula: satis ~ 1 + assess + (1 | mm(n1st, n2nd, n3rd, n4th, weights = cbind(p1st, p2nd, p3rd, p4th)))
## Data: nurses (Number of observations: 1000)
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 1;
## total post-warmup samples = 20000
## ICs: LOO = Not computed; WAIC = Not computed
##
## Group-Level Effects:
## ~mmn1stn2ndn3rdn4th (Number of levels: 26)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept) 0.55 0.09 0.4 0.76 3210 1
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept -0.03 0.11 -0.25 0.19 2023 1
## assess 0.49 0.03 0.44 0.54 20000 1
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 0.77 0.02 0.73 0.8 20000 1
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Plotting results
plot(fit_mm)
stanplot(fit_mm)
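Beyond the default trace and density plots, a posterior predictive check is a useful next diagnostic (not run in the original; assumes a reasonably recent version of ‘brms’):
# Compare the observed distribution of satis with draws from the
# posterior predictive distribution
pp_check(fit_mm)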